Pattern storage by a single neuron is revisited. Generalizing Parisi's framework for spin glasses
we obtain a variational free energy functional for the neuron. The solution is demonstrated at
high temperature and large relative number of examples, where several phases are identified by
thermodynamical stability analysis, two of them exhibiting spontaneous full replica symmetry
breaking. We give analytically the curved segments of the order parameter function and in
representative cases compute the free energy, the storage error, and the entropy.

Statistical physical modeling of neural networks has achieved much success in the description of
neural phenomena, ranging from storage and retrieval in memory networks, through learning and
generalization in feed-forward networks, to unsupervised learning [?]. Whereas some models for a
single neuron are admittedly oversimplified from the biological viewpoint, when networked they
exhibit a variety of neural functions performed by living systems and demanded from artificial
designs. In this Letter we study a single perceptron-type neuron's memorization ability, crucial
for the understanding of networked systems. When the number of synaptic couplings of a neuron
becomes large, the storage problem can be described via the statistical mechanical framework
introduced by Gardner and Derrida [?]. Since then the neuron has been well understood below
capacity; the region beyond it, however, has remained the subject of continuous research and
debate. We claim that the framework presented here carries the exact statistical mechanical
solution, which we illustrate on a limiting case that is partly tractable analytically. Networks
beyond saturation have long been known to have complex features; here we show that even a single
neuron can exhibit extreme complexity.

We consider the McCulloch-Pitts model neuron [?],

$$ \xi = \mathrm{sign}(h), \qquad h = N^{-1/2} \sum_{k=1}^{N} J_k S_k , \tag{1} $$

where $\mathbf{J}$ is the vector of synaptic couplings, $\mathbf{S}$ the input, and $\xi$ the
response. The normalization was chosen so that $h$ is typically of $O(1)$ when $N \to \infty$.
Patterns to be stored are prescribed as pairs $\{\mathbf{S}^\mu, \xi^\mu\}_{\mu=1}^{M}$, such
that $\xi^\mu$ is the output the neuron is required to generate in response to $\mathbf{S}^\mu$.
Given the ensemble of patterns, the local stability parameter $\Delta^\mu = h^\mu \xi^\mu$ obeys
some distribution $\rho(\Delta)$ (see [?]). The $\mu$-th pattern is stored by the neuron if the
actual response signal from Eq. (1) equals the desired output $\xi^\mu$, i.e., $\Delta^\mu > 0$.
The number of patterns $M$ is generically of order $N$, so $\alpha = M/N$ is an intensive
parameter. For the sake of simplicity, we generate the $S_k^\mu$-s independently from a normal
distribution, consider $\xi^\mu = \pm 1$ equally likely, and choose the spherical prior
constraint $|\mathbf{J}| = \sqrt{N}$. The cost function to be minimized, i.e., the Hamiltonian,
is the sum of errors committed on the patterns. The error on the $\mu$-th pattern is measured by
a potential $V(\Delta^\mu)$, taken here to be zero for arguments larger than a given $\kappa$
and decreasing elsewhere [?]. Storage as defined above corresponds to $\kappa = 0$, while a
$\kappa > 0$ means a stricter requirement on the local stability $\Delta$ and ensures a finite
basin of attraction for a memorized pattern during retrieval. The Hamiltonian defines through
gradient descent a dynamics in coupling space. Specifically,
$V(y) = (\kappa - y)^b \, \theta(\kappa - y)$ corresponds to the perceptron and adatron rules
for $b = 1, 2$, respectively. There is no such dynamics in the case $b = 0$, but because of its
prominent static meaning (the Hamiltonian counts the incorrectly stored patterns) we will
consider that case in concrete calculations.
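
The definitions above can be made concrete with a minimal sketch. It assumes unit-variance Gaussian inputs and uses only the Python standard library; all function names (`local_stabilities`, `potential`, `hamiltonian`, and so on) are hypothetical and chosen for illustration. It evaluates the local stabilities $\Delta^\mu$ of Eq. (1) and the Hamiltonian $\sum_\mu V(\Delta^\mu)$ for a coupling vector on the sphere $|\mathbf{J}| = \sqrt{N}$.

```python
import math
import random


def local_stabilities(J, patterns):
    """Delta^mu = xi^mu * h^mu with h^mu = N^{-1/2} sum_k J_k S_k^mu, as in Eq. (1)."""
    n = len(J)
    return [xi * sum(j * s for j, s in zip(J, S)) / math.sqrt(n)
            for S, xi in patterns]


def potential(y, kappa=0.0, b=0):
    """V(y) = (kappa - y)^b theta(kappa - y); b = 0 is the error-counting case."""
    if y >= kappa:
        return 0.0
    return 1.0 if b == 0 else (kappa - y) ** b


def hamiltonian(J, patterns, kappa=0.0, b=0):
    """Sum of the errors V(Delta^mu) committed on all patterns."""
    return sum(potential(d, kappa, b) for d in local_stabilities(J, patterns))


def random_patterns(n, m, rng):
    """M pairs (S^mu, xi^mu): Gaussian inputs (unit variance assumed), xi = +-1."""
    return [([rng.gauss(0.0, 1.0) for _ in range(n)], rng.choice((-1, 1)))
            for _ in range(m)]


def random_spherical_couplings(n, rng):
    """A random coupling vector normalized to |J| = sqrt(N)."""
    J = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(j * j for j in J))
    return [j * math.sqrt(n) / norm for j in J]


if __name__ == "__main__":
    rng = random.Random(0)
    n, alpha = 50, 2.0
    patterns = random_patterns(n, int(alpha * n), rng)
    J = random_spherical_couplings(n, rng)
    # Number of incorrectly stored patterns for a random (untrained) J.
    print(hamiltonian(J, patterns, kappa=0.0, b=0))
```

With `b=1` or `b=2` the same `hamiltonian` gives the perceptron or adatron cost, whose gradient descent defines the dynamics mentioned in the text.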

The Hamiltonian introduced above gives rise to a statistical mechanical system [?] resembling
models of spin glasses with infinite-range interactions [?]. The microstates are configurations
of synaptic couplings, quenched disorder is due to the randomly generated patterns, and the
temperature $T = \beta^{-1}$ represents the tolerance to error of storage. The partition
function is

$$ Z = \int d^N J \; \delta(\sqrt{N} - |\mathbf{J}|) \, \exp\Big( -\beta \sum_{\mu=1}^{M} V(\Delta^\mu) \Big) . \tag{2} $$

For large $N$ the replica method [?] yields the mean free energy per coupling [?]

$$ f = - \lim_{N\to\infty} \frac{\langle \ln Z \rangle}{N \beta} = \lim_{n\to 0} \frac{1}{n} \min_{Q} f(Q), \tag{3} $$

where $\langle\;\rangle$ stands for the average over patterns and

$$ f(Q) = f_s(Q) + \alpha f_e(Q), \tag{4a} $$
$$ f_s(Q) = -(2\beta)^{-1} \ln \det Q, \tag{4b} $$
$$ f_e(Q) = -\beta^{-1} \ln \int d^n x \, d^n y \, (2\pi)^{-n} \exp\Big( -\beta \sum_{a=1}^{n} V(y_a) + i \mathbf{x}\mathbf{y} - \tfrac12 \mathbf{x} Q \mathbf{x} \Big) . \tag{4c} $$

The $n \times n$ matrix $Q$ is symmetric and positive semidefinite, with elements $q_{aa} = 1$
and $-1 \le q_{ab} \le 1$. The entropic term $f_s$ is specific to the spherical model, while the
energy term $f_e$ is independent of the prior constraint on the synapses. The mean error per
pattern is

$$ \epsilon = \frac{1}{\alpha} \frac{\partial (\beta f)}{\partial \beta} = \int d\Delta \, \rho(\Delta) V(\Delta), \tag{5} $$

while the entropy per synapse

$$ s = \beta (\alpha \epsilon - f) \tag{6} $$

has the usual thermodynamic meaning in coupling space.
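
As a brief consistency check, and assuming that the mean error in Eq. (5) is read as $\epsilon = \alpha^{-1} \partial(\beta f)/\partial\beta$ so that $\alpha\epsilon$ plays the role of the energy per coupling, Eq. (6) follows from the standard thermodynamic relations. The symbol $e$ below is introduced only for this illustration.

```latex
% e: energy per coupling (hypothetical shorthand, e = \alpha\epsilon)
e = \frac{\partial (\beta f)}{\partial \beta} = \alpha \epsilon ,
\qquad
f = e - T s
\;\;\Longrightarrow\;\;
s = \beta\,(e - f) = \beta\,(\alpha\epsilon - f) .
```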

The extremization problem (3), (4) was first solved with the assumption of replica symmetry
(RS) [?]. Beyond capacity at zero temperature, however, Bouten showed, by rectifying [?], that
whenever the local stability distribution function $\rho(\Delta)$ exhibits a gap, there is an
eigenvalue at negative infinity of the Hessian $\partial^2 f(Q)/\partial q_{ab} \partial q_{cd}$
at the RS solution, so this is not a minimum in (3). Such is the case for the potential
$V(y) = \theta(\kappa - y)$. The one-step replica symmetry breaking (1-RSB) ansatz was
considered for $T = 0$, yielding a $\rho(\Delta)$ different from the RS result and, as demanded
from an improved solution, a larger energy. In the ground state beyond capacity, where all
$q_{ab} \to 1$, an eigenvalue of negative infinity has been found recently for any $R$-step RSB
($R$-RSB), and for illustration the 2-RSB solution was computed [?]. The results show a slight
improvement over 1-RSB in the energy and a significant difference in the scaled elements of $Q$,
but the 2-RSB ground state also turned out to be unstable. Ref. [?] in fact implied that a gap
in $\rho(\Delta)$ at $T = 0$ means the instability of all $R$-RSB solutions with $R$ finite.

In order to treat the storage problem of the neuron we technically generalize Parisi's method
for the Sherrington-Kirkpatrick (SK) model of spin glasses (see [10]). By Parisi's choice of $Q$
and his continuation rule in the $n \to 0$ limit, the SK free energy was expressed in terms of
an order parameter function. An elegant and useful reformulation was given in [11], whose free
energy functional for the SK problem incorporated both Parisi's and Sompolinsky's partial
differential equations (PPDE and SPDE, resp.). Its analog was used for the Little-Hopfield (LH)
memory network in [?]. For the neuron, we adopt Parisi's form for $Q$, for the moment as an
ansatz, but the thermodynamical stability analysis reported below amounts to its consistency
check. Our calculations show that despite the significant differences between the SK and the
neuron Hamiltonians, and those between the 'hard' terms in the replica free energies, the
variational free energies are remarkably similar. We obtain

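$$ f = \max_{x(q)} \; \mathop{\mathrm{extr}}_{f(q,y),\,P(q,y)} \Big[ f_s + \alpha \big( f_e + f_a^{(1)} + f_a^{(2)} \big) \Big], \tag{7a} $$
$$ f_s = -(2\beta)^{-1} \int_0^1 dq \, \big[ D(q)^{-1} - (1-q)^{-1} \big], \tag{7b} $$
$$ f_e = f(0,0), \tag{7c} $$
$$ f_a^{(1)} = \int_0^1 dq \int dy \, P(q,y) \Big[ \dot f(q,y) + \tfrac12 f''(q,y) - \tfrac12 \beta x(q) f'(q,y)^2 \Big], \tag{7d} $$
$$ f_a^{(2)} = \int dy \, P(1,y) \, \big[ V(y) - f(1,y) \big]. \tag{7e} $$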



The minimization in (3) turned to maximization due to its interchange with the $n \to 0$ limit
[10]. Here and later $\dot h = \partial h/\partial q$ and $h' = \partial h/\partial y$. The
$x(q)$ is the inverse of Parisi's order parameter function, i.e., it gives the probability that
the overlap of the synaptic vectors from two replicas is smaller than $q$, and
$D(q) = \int_q^1 d\bar q \, x(\bar q)$ is the continuation of the spectrum of the matrix $Q$ for
$n \to 0$. The range $1 \ge q \ge 0$ is now included in the ansatz, which should be verified
later. The auxiliary functionals $f_a^{(1,2)}$ carry the Lagrange multiplier field $P(q,y)$ and
thus vanish at stationarity. Variation by $P(q,y)$ makes the field $f(q,y)$ satisfy the PPDE,
which can be read off from (7d), and that by $P(1,y)$ fixes the initial condition through (7e).
So $f(q,y)$ evolves from $q = 1$ to $q = 0$ and its final value gives the energy term in (7).
Stationarity in terms of $f(q,y)$ and $f(0,y)$ leads to the SPDE

$$ \dot P(q,y) = \tfrac12 P''(q,y) + \beta x(q) \, \big[ P(q,y) f'(q,y) \big]' , \tag{8} $$

evolving from $P(0,y) = \delta(y)$ until $q = 1$. Comparison with the SK model [11], its
$p$-spin generalization [10], and the LH network [?] shows that the respective PDE-s and
$P(0,y)$ are the same, but in our case a general initial condition $f(1,y) = V(y)$ is taken. In
fact, the 'hard' term of the SK replica free energy is formally a special case of (7) if
$V(y) = \ln 2\cosh y$. Variation of (7a) in terms of explicit occurrences of $x(q)$ yields
$(2\beta)^{-1} \int_0^1 dq \, F(q,[x(q)]) \, \delta x(q)$, where

$$ F(q,[x(q)]) = \int_0^q d\bar q \, D(\bar q)^{-2} - \gamma \int dy \, P(q,y) \, f'(q,y)^2 \tag{9} $$



is simultaneously a function of $q$ and a functional of $x(q)$, with $\gamma = \alpha\beta^2$.
So wherever $\dot x(q) > 0$, stationarity requires that $F = 0$. If $x(q) \equiv m$,
$0 < m < 1$, in an interval $I$, then stationarity in terms of $m$ leads to Maxwell's rule
$\int_I dq \, F(q,[x(q)]) = 0$. The $R$-RSB ansatz involves a sequence
$q_0^{(R)} < q_1^{(R)} < \dots < q_R^{(R)}$ and has

$$ x(q) = \sum_{k=0}^{R} \big( m_{k+1}^{(R)} - m_k^{(R)} \big) \, \theta\big( q - q_k^{(R)} \big), $$

with $0 = m_0^{(R)} < m_1^{(R)} < \dots < m_R^{(R)} < m_{R+1}^{(R)} = 1$. It is naturally
incorporated into the above scheme: $F = 0$ is required at each of the points $q_k^{(R)}$, and
so is the Maxwell rule in the intervals between them (cf. [15] in a special case). Note that the
free energy can be written in short as $\max_{x(q)} [ f_s + \alpha f_e ]$ with (7b), (7c), where
$f(q,y)$ satisfies the PPDE with the initial condition as above; that corresponds to Parisi's
original formulation.
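
To make the step-function ansatz concrete, the following minimal sketch evaluates $x(q)$ for an $R$-RSB sequence and the spectrum function $D(q) = \int_q^1 d\bar q \, x(\bar q)$, which is piecewise linear in this case. The function names and argument layout are hypothetical; `q_points` holds $q_0, \dots, q_R$ and `m_values` holds $m_1, \dots, m_R$, with $m_0 = 0$ and $m_{R+1} = 1$ supplied implicitly.

```python
def step_x(q, q_points, m_values):
    """R-RSB order parameter x(q) = sum_k (m_{k+1} - m_k) theta(q - q_k).

    q_points: increasing list q_0 < ... < q_R in [0, 1]
    m_values: list m_1 < ... < m_R (empty for RS); m_0 = 0, m_{R+1} = 1
    """
    m = [0.0] + list(m_values) + [1.0]
    return sum(m[k + 1] - m[k] for k, qk in enumerate(q_points) if q >= qk)


def D_of_q(q, q_points, m_values):
    """D(q) = integral of x from q to 1, exact for piecewise constant x."""
    edges = list(q_points) + [1.0]
    plateau = list(m_values) + [1.0]  # x equals plateau[k] on [q_k, q_{k+1})
    total = 0.0
    for k in range(len(q_points)):
        lo = max(q, edges[k])
        hi = edges[k + 1]
        if hi > lo:
            total += plateau[k] * (hi - lo)
    return total


if __name__ == "__main__":
    # RS case (R = 0): a single step at q_0.
    print(D_of_q(0.1, [0.3], []), D_of_q(0.5, [0.3], []))
    # 1-RSB case: plateau m_1 between q_0 and q_1.
    print(step_x(0.3, [0.2, 0.6], [0.5]), D_of_q(0.0, [0.2, 0.6], [0.5]))
```

For the RS case this reproduces $D(q) = 1 - q_0$ below the step and $D(q) = 1 - q$ above it, which is the form that enters the entropic term (7b).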

Thermodynamical stability analysis requires the diagonalization of the Hessian of $f(Q)$ in
Eq. (4). Based on the general expression of Ref. [?] we calculated a subset of eigenvalues from
the replicon sector of the $R$-RSB, including $\lambda^{(R)}(R) = \ldots$, which derives from
states in the same smallest cluster. The $\lambda^{(R)}(R)$ is typically decisive for stability
[15, ?], and becomes negative infinity at $T = 0$ for any $R$-RSB with finite $R$ if
$\rho(\Delta)$ has a gap [?]. Concerning the maximizing $x(q)$ of (7), if $\dot x(q) > 0$ in an
interval $I$ then the continuation of the aforementioned subset is
$\lambda(q) = \dot F(q,[x(q)])$, so $\lambda(q) = 0$ in $I$; thus zero modes are present. This
is a generic property of a Parisi phase [?].

The distribution of the local stability $\Delta$ is found to be of a remarkably simple form [?],

$$ \rho(\Delta) = P(1,\Delta). \tag{10} $$

That sheds light on the physical meaning of the auxiliary field $P(q,y)$: $y$ is the local
stability at an intermediate generation of the ultrametric tree and $P(q,y)$ its probability
distribution. The analogy with the local magnetic field in the SK and LH models [?] is apparent.

Classic neural modeling focuses on $T = 0$. To solve that problem, however, extensive numerical
work may be necessary. On the other hand, in the limit $\alpha, T \to \infty$ while $\gamma$ is
kept finite, we can calculate $x(q)$ wherever it deviates from the step-like shape, whence other
analytic results follow. By resolving the PPDE and the SPDE perturbatively we obtain $f(q,y)$
and $P(q,y)$ as functionals of $x(q)$ to $O(\beta^2)$, yielding explicit functional forms for
the free energy (7) as well as for (9). Another possibility is first expanding (4) in $\beta$
and then applying the Parisi ansatz. Either way we arrive at



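$$ \beta^2 f = \phi_0 + \beta \max_{x(q)} [\phi_1] + O(\beta^2), \tag{11a} $$
$$ \phi_0 = \gamma \sqrt{W(0)}, \tag{11b} $$
$$ \phi_1 = -\tfrac12 \int_0^1 dq \, \big[ D(q)^{-1} - (1-q)^{-1} \big] \tag{11c} $$
$$ \qquad\;\; - \tfrac{\gamma}{2} \int_0^1 dq \, x(q) \dot W(q), \tag{11d} $$
$$ W(q) = \int \frac{d^2 t}{2\pi} \exp\big( -\tfrac12 |\mathbf{t}|^2 \big) \, V(\mathbf{n}_1 \cdot \mathbf{t}) \, V(\mathbf{n}_2 \cdot \mathbf{t}), \tag{11e} $$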



where $|\mathbf{n}_1| = |\mathbf{n}_2| = 1$ and $\mathbf{n}_1 \cdot \mathbf{n}_2 = q$. The
functional (11c) happens to be equivalent to the free energy in Nieuwenhuizen's generalization
of the spherical SK-type spin glass model [?]. Formula (9) is in leading order

$$ F(q,[x(q)]) = \int_0^q d\bar q \, D(\bar q)^{-2} - \gamma \dot W(q), \tag{12} $$

thus for a continuous $x(q)$ with $\dot x(q) > 0$ one has

$$ x(q) = \tfrac12 \gamma^{-1/2} \, \dddot W(q) \, \ddot W(q)^{-3/2}, \tag{13} $$

cf. Eq. (9) in [9]. Various trial functions $x(q)$, such as an $R$-RSB or Parisi's ansatz of a
continuous order parameter function between two plateaux (such a classic Parisi phase will be
referred to as SG-I), can be formulated by means of (12). We calculated the full set of replicon
eigenvalues of $R$-RSB based on [?]. With $r = 0, \dots, R-1$ and $k, l = r+1, \dots, R$ we have

$$ \lambda(r;k,l) = D(q_k^{(R)})^{-1} D(q_l^{(R)})^{-1} - \gamma \ddot W(q_r^{(R)}), \tag{14} $$

and $\lambda^{(R)}(R)$ is obtained if $q_R^{(R)}$ is substituted for all $q$-s in (14). We
studied the example $V(y) = \theta(\kappa - y)$, when

$$ \dot W(q) = (2\pi)^{-1} (1 - q^2)^{-1/2} \exp\big( -\kappa^2/(1+q) \big). \tag{15} $$
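
The closed form for $\dot W(q)$ can be checked numerically. The sketch below assumes the error-counting potential $V(y) = \theta(\kappa - y)$, for which $W(q)$ of Eq. (11e) is the probability that two unit normal variables with correlation $q$ both fall below $\kappa$. It uses the decomposition $y_a = \sqrt{q}\, z + \sqrt{1-q}\, u_a$ (valid for $0 \le q < 1$), a plain trapezoid rule, and a central difference; function names are hypothetical and only the standard library is used.

```python
import math


def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def W_error_counting(q, kappa, zmax=8.0, npts=1601):
    """W(q) = int Dz Phi((kappa - sqrt(q) z) / sqrt(1 - q))^2 for 0 <= q < 1.

    Trapezoid rule over z in [-zmax, zmax] with Gaussian measure Dz.
    """
    h = 2.0 * zmax / (npts - 1)
    sq, s1 = math.sqrt(q), math.sqrt(1.0 - q)
    total = 0.0
    for i in range(npts):
        z = -zmax + i * h
        weight = 0.5 if i in (0, npts - 1) else 1.0
        gauss = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * gauss * Phi((kappa - sq * z) / s1) ** 2
    return total * h


def W_dot_formula(q, kappa):
    """Closed form of dW/dq as in Eq. (15)."""
    return math.exp(-kappa ** 2 / (1.0 + q)) / (2.0 * math.pi * math.sqrt(1.0 - q * q))


def W_dot_numeric(q, kappa, dq=1e-4):
    """Central finite difference of the quadrature result."""
    return (W_error_counting(q + dq, kappa) - W_error_counting(q - dq, kappa)) / (2.0 * dq)


if __name__ == "__main__":
    for q, kappa in [(0.3, 0.5), (0.6, 1.0)]:
        print(q, kappa, W_dot_numeric(q, kappa), W_dot_formula(q, kappa))
```

For $\kappa = 0$ the quadrature can also be compared with the elementary result $W(q) = 1/4 + \arcsin(q)/(2\pi)$, whose derivative is the $\kappa = 0$ case of Eq. (15).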

Four distinct phases are found and depicted on Fig. 1. At the boundary of the transition
RS–SG-I and, furthermore, at the RS–1-RSB line for $\kappa < \kappa_2$, if the border is
approached from the RSB phase, the $x(q)$ function converges for each $0 < q < 1$ to its RS
value. Here the 3rd derivative of the mean free energy is discontinuous.
On the other hand, for $\kappa > \kappa_2$, if the RS–1-RSB line is approached from the RSB side
then $q_0^{(1)} \to q_0^{(0)}$ but $q_1^{(1)}$ stays different from $q_0^{(0)}$. The plateau
value $m_1^{(1)} \to 1$, so the limits of $x(q)$ from the two phases differ at one point $q = 1$.
At that transition the 2nd derivative of the free energy is discontinuous. This phenomenon is
analogous to the RS–1-RSB transition in the random energy model (see [10]), and similar two
types of segments of the RS–1-RSB borderline were identified in the spherical, $p$-spin SK
model. The RS–SG-I boundary is analogous to the Parisi transition in the SK model. We found a
fourth phase, where $x(q)$ is like an SG-I curve joined with a 1-step function. It is of the
same type as the phase PC II of the Potts spin glass and the low-temperature state of the
$p$-spin SK model [?]; furthermore, it is analogous to the phase SG-IV of [?]. The borderline
$\lambda^{(0)}(0) = 0$ of local stability of the RS state, i.e., the de Almeida-Thouless (AT)
curve, coincides with the border of the RS phase for $\kappa < \kappa_2$ but enters the RSB
phases for larger $\kappa$-s. However, whenever RS and RSB states coexist, we find that the RSB
state maximizes the free energy functional (11c). No coexistence between different types of RSB
phases was observed. One characteristic $x(q)$ function from each phase is shown on Fig. 2. Note
that if $x(q)$ has a curved segment, this is explicitly given by Eqs. (13), (15). For
illustration, thermodynamic quantities are plotted along the $\kappa = 0$ line on Fig. 3. We
expect that for some finite temperatures similar phases exist; nevertheless, in the ground state
the phase diagram simplifies to the single borderline RS–SG-I, i.e., the known limit of capacity
curve. The richness of the neural behavior for $T \to \infty$ should be contrasted with the
generic RS high-$T$ phase in SK-type disordered magnets.
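
The high-temperature formulas lend themselves to a short numerical sketch. It assumes the error-counting potential with $\dot W(q)$ as in Eq. (15), the RS ansatz $x(q) = \theta(q - q_0)$ so that $F = 0$ in Eq. (12) reads $q_0/(1-q_0)^2 = \gamma \dot W(q_0)$, the RS replicon eigenvalue $(1-q_0)^{-2} - \gamma \ddot W(q_0)$ from Eq. (14), and the curved segment of Eq. (13). All function names are hypothetical, and the bisection simply returns one root in $(0,1)$; this is an illustration under these assumptions, not the procedure used to produce the figures.

```python
import math


def W1(q, kappa):
    """First derivative of W, Eq. (15)."""
    return math.exp(-kappa ** 2 / (1.0 + q)) / (2.0 * math.pi * math.sqrt(1.0 - q * q))


def dlogW1(q, kappa):
    """d/dq of ln W1(q)."""
    return q / (1.0 - q * q) + kappa ** 2 / (1.0 + q) ** 2


def W2(q, kappa):
    """Second derivative of W."""
    return W1(q, kappa) * dlogW1(q, kappa)


def W3(q, kappa):
    """Third derivative of W."""
    g1 = dlogW1(q, kappa)
    g2 = (1.0 + q * q) / (1.0 - q * q) ** 2 - 2.0 * kappa ** 2 / (1.0 + q) ** 3
    return W1(q, kappa) * (g1 * g1 + g2)


def x_curved(q, gamma, kappa):
    """Curved segment of x(q), Eq. (13); requires W2 > 0 (e.g. q > 0)."""
    return 0.5 * W3(q, kappa) / (math.sqrt(gamma) * W2(q, kappa) ** 1.5)


def rs_overlap(gamma, kappa, tol=1e-12):
    """Bisection for the RS overlap q0 solving q0/(1-q0)^2 = gamma * W1(q0)."""
    lo, hi = 0.0, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / (1.0 - mid) ** 2 - gamma * W1(mid, kappa) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def at_eigenvalue(gamma, kappa):
    """RS replicon eigenvalue; its zero marks the AT line."""
    q0 = rs_overlap(gamma, kappa)
    return (1.0 - q0) ** -2 - gamma * W2(q0, kappa)


if __name__ == "__main__":
    for gamma in (1.0, 10.0, 100.0):
        print(gamma, rs_overlap(gamma, 0.0), at_eigenvalue(gamma, 0.0))
```

Scanning `at_eigenvalue` over `gamma` at fixed `kappa` and locating its sign change gives a numerical estimate of the AT curve within this sketch; `x_curved` can then be evaluated on the interval where the order parameter function is continuous.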

In conclusion, we have put forth an exact description of storage by a single neuron in terms of
a variational free energy, the solution of which we demonstrated in the high-$T$ limit with the
error-counting potential. Storage beyond capacity with other error measures, learning and
generalization of unlearnable tasks, storage by networked neurons, and frustrated phases in
general are natural directions for future investigations.

FIG. 1. Phase diagram for the potential $V(y) = \theta(\kappa - y)$ in the $(\gamma, \kappa)$
plane for high $T$, by numerical maximization of Eq. (11c). The full lines separate phases with
different types of global maxima. The RS, 1-RSB, SG-IV, and SG-I phases are indicated by a, b,
c, and d, respectively. The AT curve is the RS phase boundary for
$\kappa < \kappa_2 \approx 2.38$, and to the right of the arrow it analytically continues in the
dashed line.

FIG. 2. The $x(q)$ function at representative points as marked on Fig. 1 by crosses.

FIG. 3. The entropy $s$ from Eq. (6), the free energy term $\phi_1$ from Eq. (11), and the
enlarged correction $\epsilon_1 = T(\ldots - \epsilon)$ for the energy (5) in the high $T$
limit. The RS–SG-I transition is marked by an arrow. The dashed lines correspond to the
thermodynamically unstable RS state beyond this transition point.